Metacontrol for Adaptive Imagination-Based Optimization

Authors

Jessica B. Hamrick

Andrew J. Ballard

Razvan Pascanu

Oriol Vinyals

Nicolas Heess

Peter W. Battaglia

Abstract

Many machine learning systems are built to solve the hardest examples of a particular task, which often makes them large and expensive to run—especially with respect to the easier examples, which might require much less computation. For an agent with a limited computational budget, this “one-size-fits-all” approach may result in the agent wasting valuable computation on easy examples, while not spending enough on hard examples. Rather than learning a single, fixed policy for solving all instances of a task, we introduce a metacontroller which learns to optimize a sequence of “imagined” internal simulations over predictive models of the world in order to construct a more informed, and more economical, solution. The metacontroller component is a model-free reinforcement learning agent, which decides both how many iterations of the optimization procedure to run, as well as which model to consult on each iteration. The models (which we call “experts”) can be state transition models, action-value functions, or any other mechanism that provides information useful for solving the task, and can be learned on-policy or off-policy in parallel with the metacontroller. When the metacontroller, controller, and experts were trained with “interaction networks” (Battaglia et al., 2016) as expert models, our approach was able to solve a challenging decision-making problem under complex non-linear dynamics. The metacontroller learned to adapt the amount of computation it performed to the difficulty of the task, and learned how to choose which experts to consult by factoring in both their reliability and individual computational resource costs. This allowed the metacontroller to achieve a lower overall cost (task loss plus computational cost) than more traditional fixed policy approaches. These results demonstrate that our approach is a powerful framework for using rich forward models for efficient model-based reinforcement learning.
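The trade-off described in the abstract (task loss plus computational cost, with the number of iterations and the choice of expert decided per instance) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-dimensional toy task, the two experts and their costs, the thresholds, and all names. In particular, the metacontroller here is a hand-written rule, not the learned model-free reinforcement learning agent the abstract describes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

ROUGH, EXACT = 0, 1  # indices into EXPERTS (both experts are hypothetical)
History = List[Tuple[int, float]]  # (expert consulted, imagined signed error)


@dataclass
class Expert:
    name: str
    predict_error: Callable[[float, float], float]  # (action, target) -> imagined error
    cost: float  # computational cost charged per consultation


EXPERTS = [
    Expert("rough", lambda a, t: 0.5 * (a - t), cost=0.05),  # cheap, sees half the error
    Expert("exact", lambda a, t: a - t, cost=0.20),          # reliable but expensive
]


def run_episode(target: float,
                metacontroller: Callable[[History], Optional[int]],
                max_iters: int = 20) -> Tuple[float, float]:
    """Toy task (assumption): move a scalar action from 0.0 onto `target`."""
    action, compute_cost = 0.0, 0.0
    history: History = []
    for _ in range(max_iters):
        choice = metacontroller(history)  # which expert to consult, or None to stop
        if choice is None:
            break
        expert = EXPERTS[choice]
        imagined = expert.predict_error(action, target)  # "imagined" evaluation
        compute_cost += expert.cost
        action -= imagined  # controller corrects the action by the imagined error
        history.append((choice, imagined))
    task_loss = (action - target) ** 2
    return task_loss, compute_cost


def fixed_policy(expert_idx: int, n_iters: int) -> Callable[[History], Optional[int]]:
    """Baseline: always consult the same expert for a fixed number of iterations."""
    return lambda history: expert_idx if len(history) < n_iters else None


def adaptive_metacontroller(history: History) -> Optional[int]:
    """Hand-written stand-in for a learned metacontroller (thresholds are arbitrary)."""
    if not history:
        return ROUGH  # difficulty unknown: start with the cheap expert
    last_expert, last_error = history[-1]
    if last_expert == EXACT:
        return None   # trust the reliable expert's correction as final
    if abs(last_error) < 0.2:
        return None   # easy instance: stop early
    if abs(last_error) >= 2.0:
        return EXACT  # hard instance: one expensive step beats many cheap ones
    return ROUGH


def total_cost(metacontroller, targets) -> float:
    """Sum of task loss plus computational cost over a set of task instances."""
    return sum(sum(run_episode(t, metacontroller)) for t in targets)


if __name__ == "__main__":
    targets = [0.1, 1.0, 3.0, 8.0]  # easy to hard instances
    print("adaptive     :", total_cost(adaptive_metacontroller, targets))
    print("fixed rough×4:", total_cost(fixed_policy(ROUGH, 4), targets))
    print("fixed exact×1:", total_cost(fixed_policy(EXACT, 1), targets))
```

In this toy setting the adaptive rule spends one cheap consultation on the easiest instance and switches to the expensive expert only on the hardest one, which is the kind of behavior the abstract reports for the learned metacontroller. The numbers themselves carry no meaning beyond this sketch.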

Similar resources

RELIABILITY-BASED DESIGN OPTIMIZATION OF COMPLEX FUNCTIONS USING SELF-ADAPTIVE PARTICLE SWARM OPTIMIZATION METHOD

A Reliability-Based Design Optimization (RBDO) framework is presented that accounts for stochastic variations in structural parameters and operating conditions. The reliability index calculation is itself an iterative process, potentially employing an optimization technique to find the shortest distance from the origin to the limit-state boundary in a standard normal space. Monte Carlo simulati...

Airfoil Shape Optimization with Adaptive Mutation Genetic Algorithm

An efficient method for scattering Genetic Algorithm (GA) individuals in the design space is proposed to accelerate airfoil shape optimization. The method used here is based on the variation of the mutation rate for each gene of the chromosomes by taking feedback from the current population. An adaptive method for airfoil shape parameterization is also applied and its impact on the optimum desi...

Optimal Placement and Sizing of DGs and Shunt Capacitor Banks Simultaneously in Distribution Networks using Particle Swarm Optimization Algorithm Based on Adaptive Learning Strategy

Optimization of DGs and capacitors is a nonlinear optimization problem with equality and inequality constraints, and meta-heuristic methods have proven efficient for solving optimization problems of any degree of complexity. As the population grows and electricity consumption increases, the need for generation increases, which further reduces voltage, increases los...

Adaptive Rule-Base Influence Function Mechanism for Cultural Algorithm

This study proposes a modified version of cultural algorithms (CAs) that uses a rule-based system for the influence function. This rule-based system selects and applies the suitable knowledge source according to the distribution of the solutions. It is important to apply an appropriate influence function to each individual, according to its role in the search process. This rule ...

A limited memory adaptive trust-region approach for large-scale unconstrained optimization

This study is concerned with a trust-region-based method for solving unconstrained optimization problems. The approach takes advantage of the compact limited memory BFGS updating formula together with an appropriate adaptive radius strategy. In our approach, the adaptive technique decreases the number of subproblems to be solved, while utilizing the structure of limited memory quasi-Newt...

Journal:

CoRR

Volume: abs/1705.02670

Publication date: 2017
